System and method for determining amino acid sequence of polypeptide

ABSTRACT

This invention discloses systems and methods for determining the sequence of amino acids in a peptide chain that constitutes a protein. The protein is first hydrolyzed into various short peptides and amino acid enantiomers. The systems and methods are then used to separate the short peptides and the amino acid enantiomers, to identify each amino acid enantiomer qualitatively, and to obtain the molecular mass signal of each peptide. Next, the identified amino acid enantiomers are used to construct all possible short peptides, in order from the lowest-molecular-weight dipeptide to higher-molecular-weight short peptides, and the correct short peptides are confirmed by matching the molecular weights obtained from the mass spectrometry measurement. The confirmed short peptides are then combined to give a larger peptide. The process continues until the whole amino acid sequence of the peptide chain of the protein is determined.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Taiwan Patent Application No. 101148182, filed Dec. 18, 2012, the entire contents of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to systems and methods for determining the amino acid sequence of proteins or polypeptides.

2. Description of Related Art

Proteins are large organic molecules consisting of one or more polypeptide chains of amino acids. The backbone of a polypeptide is linked by peptide bonds, each formed between two adjacent amino acids by dehydration between the carboxyl group of one amino acid and the amine group of the other. Polypeptides differ from one another primarily in their amino acid sequence. A peptide formed by two amino acids is called a “dipeptide,” a peptide formed by three amino acids is called a “tripeptide,” and so on.

Because the amino acid sequence determines the properties and biological functions of a protein, it is important to determine the correct amino acid sequence of the protein [1]. In 1955, the English biochemist Sanger successfully determined the amino acid sequence of insulin and proved that the sequence is correct [2]. In addition, beginning in 1958, Perutz and Kendrew determined the three-dimensional structures of proteins by X-ray crystallography [3-4].

Amino acids are the basic units of proteins and are produced by fermentation, artificial synthesis, or hydrolysis of proteins. All amino acids hydrolyzed from natural proteins are α-amino acids, and the term “amino acids” as used in biochemistry typically refers to α-amino acids, while β-amino acids and γ-amino acids are used in organic synthesis, the petrochemical industry, and medical science. Table 1 lists the 20 common amino acids found in natural proteins.

TABLE 1

Name           Abbreviation  Side chain   Molecular  Isoelectric  Dissociation      Dissociation      -log(side chain
                                          weight     point        constant          constant          dissociation
                                                                  (carboxyl group)  (amino group)     constant) (pK_(R))
Glycine        G, Gly        Hydrophilic  75.07      6.06         2.35              9.78
Alanine        A, Ala        Hydrophobic  89.09      6.11         2.35              9.87
Valine         V, Val        Hydrophobic  117.15     6            2.39              9.74
Leucine        L, Leu        Hydrophobic  131.17     6.01         2.33              9.74
Isoleucine     I, Ile        Hydrophobic  131.17     6.05         2.32              9.76
Phenylalanine  F, Phe        Hydrophobic  165.19     5.49         2.2               9.31
Tryptophan     W, Trp        Hydrophobic  204.23     5.89         2.46              9.41
Tyrosine       Y, Tyr        Hydrophilic  181.19     5.64         2.2               9.21              10.46
Aspartic acid  D, Asp        Acid         133.1      2.85         1.99              9.9               3.9
Histidine      H, His        Alkaline     155.16     7.6          1.8               9.33              6.04
Asparagine     N, Asn        Hydrophilic  132.12     5.41         2.14              8.72
Glutamic acid  E, Glu        Acid         147.13     3.15         2.1               9.47              4.07
Lysine         K, Lys        Alkaline     146.19     9.6          2.16              9.06              10.54
Glutamine      Q, Gln        Hydrophilic  146.15     5.65         2.17              9.13
Methionine     M, Met        Hydrophobic  149.21     5.74         2.13              9.28
Arginine       R, Arg        Alkaline     174.2      10.76        1.82              8.99              12.48
Serine         S, Ser        Hydrophilic  105.09     5.68         2.19              9.21
Threonine      T, Thr        Hydrophilic  119.12     5.6          2.09              9.1
Cysteine       C, Cys        Hydrophilic  121.16     5.05         1.92              10.7              8.37
Proline        P, Pro        Hydrophobic  115.13     6.3          1.95              10.64

Except for glycine, every α-amino acid has an asymmetric carbon and thus two enantiomers with opposite optical rotations, i.e., dextrorotatory (D) and levorotatory (L). The proteins or polypeptides of organisms are typically constructed from levorotatory amino acids. However, exceptions exist; for instance, tyrocidine and gramicidin also include dextrorotatory amino acids.

The hydrolysis of polypeptides may generate the individual constituent amino acids and their enantiomers, together with various peptides of different lengths. Conventional high-performance liquid chromatography (HPLC) can partially separate a few of these hydrolytes [5-7], but fails to separate them all.

To determine the amino acid sequence, in 1984 Biemann et al. [8-9] used mass spectrometry data to confirm the relationship between the amino acid sequence and the nucleic acid sequence. In that work, proteins were hydrolyzed into peptide fragments by trypsin, high-performance liquid chromatography (HPLC) was used to separate the peptide fragments, and fast atom bombardment mass spectrometry (FAB-MS) was used to analyze the masses of the peptide fragments. The FAB-MS data were compared with all possible nucleic acid sequences to confirm the relationship between the amino acid sequence and the nucleic acid sequence. In addition, Edman developed the Edman sequencer [10-11], which determines the amino acid sequence of a protein by cleaving the polypeptide chain in order from the N-terminus to the C-terminus. Edman's method suffers from long analysis times, poor sensitivity, and an inability to separate amino acid enantiomers.

REFERENCES

[1] Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, Peter Walter. Molecular Biology of the Cell, 4th ed. Garland Science, New York, 2002.
[2] Laylin K. James. Nobel Laureates in Chemistry 1901-1992. American Chemical Society; Chemical Heritage Foundation, Washington, D.C., 1993.
[3] H. Muirhead, M. F. Perutz. “Structure of hemoglobin, three-dimensional Fourier synthesis of reduced human hemoglobin at 5.5-Å resolution,” Nature, 199(4894): 633-638, 1963.
[4] J. Kendrew, G. Bodo, H. Dintzis, R. Parrish, H. Wyckoff, D. Phillips. “Three-dimensional model of the myoglobin molecule obtained by x-ray analysis,” Nature, 181(4610): 662-666, 1958.
[5] T. Ueno, M. Tanaka, T. Matsui, K. Matsumoto. “Determination of antihypertensive small peptides, Val-Tyr and Ile-Val-Tyr, by fluorometric high-performance liquid chromatography combined with a double heart-cut column switching technique,” Analytical Sciences, 21, 997-1000, 2005.
[6] M. Gilar, P. Olivova, A. E. Daly, J. C. Gebler. “Two-dimensional separation of peptides using RP-RP-HPLC system with different pH in first and second separation dimensions,” Journal of Separation Science, 28, 1694-1703, 2005.
[7] H. J. Issaq, K. C. Chan, J. Blonder, X. Ye, T. D. Veenstra. “Separation, detection and quantitation of peptides by liquid chromatography and capillary electrochromatography,” Journal of Chromatography A, 1216, 1858-1837, 2009.
[8] Chung, Deborah D. L. The Road to Scientific Success: Inspiring Life Stories of Prominent Researchers (Road to Scientific Success). World Scientific Publishing Company, 2006.
[9] Gibson B. W. and Biemann K. “Strategy for the mass spectrometric verification and correction of the primary structures of proteins deduced from their DNA sequences,” Proceedings of the National Academy of Sciences, 81, 1956-1960, 1984.
[10] M. Kai, M. Morizono, M. N. Wainaina, T. Kabashima. “Chemiluminescence detection of amino acids using an Edman-type reagent, 4-(1-cyanoisoindolyl)phenylisothiocyanate,” Analytica Chimica Acta, 535, 153-159, 2005.
[11] Niall H. D. “Automated Edman degradation: the protein sequenator,” Meth. Enzymol., 27: 942-1010, 1973.

SUMMARY OF THE INVENTION

An object of the present invention is to provide methods and systems to determine the amino acid sequence of polypeptides and to distinguish the enantiomers of amino acids in a fast, effective manner.

One embodiment of this invention provides a system to determine the amino acid sequence of a protein or a polypeptide. The protein or polypeptide is first thermally hydrolyzed to a hydrolyte, which comprises the individual constituent amino acids (including enantiomers), a variety of short peptides constructed from the amino acids, and un-hydrolyzed protein or polypeptide. The system comprises a first column, a second column, and a third column. The first column connects to an ultraviolet detector, so as to separate the amino acids and short peptides. The second column connects to a fluorescence detector, so as to identify the amino acid enantiomers. The third column connects to a mass spectrometer, so as to identify the short peptides and the amino acid cysteine through the molecular weight signal (m/z) of mass spectrometry. The identified amino acid enantiomers are used to construct all possible short peptides, in order from the lowest-molecular-weight dipeptide to higher-molecular-weight short peptides, and the correct short peptides are confirmed by matching the molecular weight signal (m/z) obtained from the mass spectra. The confirmed short peptides are then combined to give a larger peptide. The process continues until the whole amino acid sequence of the polypeptide or protein is determined.

Another embodiment of this invention provides a method to determine the amino acid sequence of a protein or a polypeptide, the method comprising:
(1) thermally hydrolyzing the protein or the polypeptide to a hydrolyte comprising constituent amino acids (including enantiomers), a variety of short peptides constructed from the amino acid enantiomers, and un-hydrolyzed protein or polypeptide;
(2) separating the amino acid enantiomers and the short peptides;
(3) identifying the amino acid enantiomers;
(4) identifying the short peptides using a mass spectrometer through the molecular weight signal (m/z) of the mass spectra;
(5) constructing all possible dipeptides from the identified amino acid enantiomers, and confirming the possible dipeptides by matching the molecular weight obtained from the mass spectra;
(6) constructing all possible tripeptides from the confirmed dipeptides, and confirming the possible tripeptides by matching the molecular weight obtained from the mass spectra;
(7) constructing all possible larger peptides, each with at least one more amino acid enantiomer residue, from the confirmed short peptides (i.e., the confirmed dipeptides and tripeptides), and confirming the possible larger peptides by matching the molecular weight obtained from the mass spectra;
wherein step (7) is performed repeatedly until none of the possible larger peptides can be confirmed by the molecular weight signal (m/z) of the mass spectra, whereby the amino acid sequence of the protein or the polypeptide is determined.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1F show a method and system for determining the amino acid sequence of a polypeptide according to a preferred embodiment of the present invention.

FIG. 2 shows the chromatogram of the first column according to the preferred embodiment of the present invention.

FIG. 3 shows the chromatogram of the second column according to the preferred embodiment of the present invention.

FIG. 4 shows the chromatogram of the second column according to the preferred embodiment, in which 24 standard amino acid enantiomers are separated by the second column.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Reference will now be made in detail to specific embodiments of the invention. Examples of these embodiments are illustrated in the accompanying drawings. While the invention will be described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to these embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. In other instances, well-known process operations and components are not described in detail in order not to unnecessarily obscure the present invention. While the drawings are illustrated in detail, it is appreciated that the quantity of the disclosed components may be greater or less than that disclosed, except where the amount of the components is expressly restricted. Wherever possible, the same or similar reference numbers are used in the drawings and the description to refer to the same or like parts.

FIGS. 1A-1F show a system and method for determining the amino acid sequence of a polypeptide or a protein. The system comprises a first column 10, a second column 12, a third column 14, a first detector 20 (ultraviolet detector 20), a second detector 22 (fluorescence detector 22), and a third detector 24 (mass spectrometer 24). In addition, the system further comprises a first pump 40, a second pump 42, a third pump 44, a fourth pump 46, and an injection syringe 48 for conveying or injecting a first mobile phase 50, a second mobile phase 52, a third mobile phase 54, a fluorescence derivatization agent 56, and a solvent 58 to the corresponding columns, the fluorescence derivatization coil 70, the sampling loop 72, and the corresponding detectors. Further, the detectors 20/22/24 are connected to a computer for the analysis work.

In this preferred embodiment, the first column 10 is an affinity chiral column (Astec ChiroBiotic™ T, 250 mm×4.6 mm I.D., particle diameter 5 μm) with a guard column ChiroBiotic™ T (30 mm×4.6 mm I.D., particle diameter 5 μm), purchased from Supelco (Bellefonte, U.S.A.). The second column 12 is a ligand-exchange column (Phenomenex Chirex 3126(D)-penicillamine, 250 mm×4.6 mm I.D., particle diameter 5 μm), with a guard column Chirex 3126(D)-penicillamine (30 mm×4.6 mm I.D., particle diameter 5 μm), purchased from Phenomenex (Torrance, U.S.A.). The third column 14 is a reversed-phase column (Zorbax Eclipse XDB-C8, 150 mm×4.6 mm I.D., particle diameter 5 μm), with a guard column Zorbax Eclipse XDB-C8 (12.5 mm×4.6 mm I.D., particle diameter 5 μm), purchased from Agilent (Waldbronn, Germany).

In this preferred embodiment, the mass spectrometer 24 is an ion trap mass spectrometer (Bruker Daltonics, Esquire 2000, Billerica, U.S.A.) coupled with an Electrospray Ionization Interface (ESI).

In this preferred embodiment, both the first mobile phase 50 and the second mobile phase 52 are a 2 mM CuSO₄/MeOH solution with a volume ratio (v/v) of 90/10, and the third mobile phase 54 and the solvent 58 are 100% methanol. The fluorescence derivatization agent 56 is prepared as follows. First, 900 mL of deionized distilled water and 3.8138 g of Na₂B₄O₇·10H₂O are added to a container to form a solution. A 5 mM NaOH aqueous solution is then used to adjust the pH of the solution to 9.5. Deionized distilled water is then added until the total volume of the solution is 1000 mL, giving a 0.01 M borate buffer solution. After that, 2.146 g of o-phthaldialdehyde (OPA) and 1 mL of mercaptoethanol (C₂H₆OS) are added to the buffer solution, and the solution is shaken in an orbital-shaking incubator at 30° C. and 150 rpm for one day to yield the fluorescence derivatization agent 56. The fluorescence derivatization agent 56 is used to derivatize the amino acids so that they can be analyzed by the fluorescence detector 22.
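The stated buffer concentration can be checked with simple arithmetic. The following is a minimal sketch, not part of the disclosed method; the molar masses are standard values that do not appear in the text above, and the small volume change caused by adding OPA and mercaptoethanol is assumed to be negligible.

```python
# Molar masses in g/mol. These are standard values assumed for this check;
# they are not given in the specification.
M_BORAX_DECAHYDRATE = 381.37  # Na2B4O7·10H2O
M_OPA = 134.13                # o-phthaldialdehyde, C8H6O2


def molarity(mass_g: float, molar_mass_g_per_mol: float, volume_l: float) -> float:
    """Return the molar concentration (mol/L) of a dissolved solid."""
    return mass_g / molar_mass_g_per_mol / volume_l


# 3.8138 g of borax decahydrate made up to 1000 mL
borate_molar = molarity(3.8138, M_BORAX_DECAHYDRATE, 1.0)

# 2.146 g of OPA in the same (approximately) 1000 mL of buffer
opa_molar = molarity(2.146, M_OPA, 1.0)

print(f"borate buffer: {borate_molar:.4f} M")
print(f"OPA: {opa_molar * 1000:.1f} mM")
```

Under these assumptions the borate concentration agrees with the 0.01 M figure given above, and the OPA concentration comes out at roughly 16 mM.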

According to the embodiment, the protein or polypeptide under test is thermally hydrolyzed by the following procedure. 1 mL of a 1000 ppm standard protein or polypeptide solution is placed into one well of a 20-well array platform reactor, which is held at a predetermined temperature. The reaction time is about 1 day to 4 days. After the hydrolysis, the hydrolyte is taken out and diluted 10-fold with deionized distilled water. The diluted hydrolyte is then passed through a syringe filter, and the filtrate is kept for later use.

It should be noted that the hydrolysis temperature can be controlled so that the protein or polypeptide is partially hydrolyzed rather than completely hydrolyzed. For example, if the protein or polypeptide is a tripeptide, the hydrolysis temperature is controlled so that the hydrolyte contains un-hydrolyzed tripeptide, two kinds of dipeptide, and three kinds of amino acid enantiomers.
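The tripeptide example can be illustrated by enumerating every contiguous fragment of a sequence. This is a minimal sketch under stated assumptions: the function name `hydrolysis_fragments` is hypothetical, only peptide-bond cleavage is modeled, and racemization (which would add D-enantiomers) is ignored.

```python
def hydrolysis_fragments(residues):
    """Group all contiguous sub-sequences of a peptide by length.

    Models partial hydrolysis as cleavage of peptide bonds only, so every
    fragment is a contiguous run of the original residues.
    """
    n = len(residues)
    by_length = {}
    for length in range(1, n + 1):
        fragments = []
        for start in range(n - length + 1):
            fragment = "-".join(residues[start:start + length])
            if fragment not in fragments:  # repeated residues give repeated fragments
                fragments.append(fragment)
        by_length[length] = fragments
    return by_length


# Glutathione is used here because it is the tripeptide examined later on.
fragments = hydrolysis_fragments(["Glu", "Cys", "Gly"])
for length, names in fragments.items():
    print(length, names)
```

For a tripeptide of three different residues this gives three amino acids, two dipeptides, and the intact tripeptide, matching the composition described above.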

The procedure for determining the amino acid sequence of the protein or polypeptide is as follows. As shown in FIG. 1A, the filtered hydrolyte described above is injected into the first column 10 via a syringe injection valve 60 to separate the amino acids and short peptides of the hydrolyte. Then, as shown in FIG. 1B, when the amino acids are about to elute from the first column 10, the valve 30 is switched to connect the first column 10 and the second column 12 in series, and the amino acids in the hydrolyte flow into the second column 12. The second column 12 separates the amino acid enantiomers, and the fluorescence derivatization agent (OPA) 56 reacts with the amino acid enantiomers to convert them into a form that can be analyzed by the fluorescence detector 22. As shown in FIG. 1C, once the amino acid enantiomers have completely flowed into the second column 12, the valve 30 is switched back to its original position. As shown in FIG. 1D, when the short peptides are about to elute from the first column 10, the valve 32 is switched so that the short peptides flow into the sampling loop 72. Then, as shown in FIG. 1E, once the short peptides have completely flowed into the sampling loop 72, the valve 32 and the valve 34 are switched simultaneously so that the third mobile phase 54 (100% methanol) carries the short peptides in the sampling loop 72 into the third column 14, which separates the short peptides from the copper ions. At this time, the third mobile phase elutes the sulfate ions and copper ions from the third column 14 first, and they flow into the waste collection bottle; meanwhile, the injection syringe 48 continuously injects methanol 58 into the mass spectrometer 24 because the third column 14 is not yet connected to the mass spectrometer 24. As shown in FIG. 1F, after waiting about 30 seconds, the valve 34 is switched so that the third column 14 and the mass spectrometer 24 are connected in series and the short peptides eluting from the third column 14 can be analyzed by the mass spectrometer 24.

The enantiomers of amino acids are detected by the fluorescence detector 22, whose excitation wavelength is 340 nm and emission wavelength is 450 nm; the amino acids and the short peptides are detected by the ultraviolet detector at a wavelength of 254 nm. The mass spectrometer 24 is an ion trap mass spectrometer with an Electrospray Ionization Interface (ESI) in which both the nebulizing gas and the drying gas are nitrogen, the pressure and flow rate of the nebulizing gas are 20.0 psi and 5 L min⁻¹, respectively, and the temperature of the drying gas is 300° C.

The mass spectrum signal (m/z) is detected in positive ion mode. The capillary inlet voltage and outlet voltage, the skimmer 1 voltage, and the ion trap driving voltage are set to 4500, 38.2, 31.5, and 36.3 V, respectively. The mass-to-charge ratio (m/z) range is set between 50 and 1000. Because the flow rate (1 mL min⁻¹) of the mobile phase 54 leaving the third column 14 is too large for the ESI, a flow rate splitter is used to lower the flow rate of the eluent entering the ESI.

In this embodiment, the protein or polypeptide is thermally hydrolyzed to short peptides and amino acids, and a three-dimensional HPLC is used to separate them step by step. In addition, the enantiomers of the amino acids can be separated and used for the determination of the amino acid sequence as well. In particular, the first column 10 is used to separate the short peptides and the amino acids, the second column 12 is used to separate the enantiomers of the amino acids, and the third column 14 is used to separate the short peptides, copper ions, and sulfate ions; once the mobile phase has been changed to methanol, the mass spectrometer 24 is used to analyze the short peptides and cysteine.

Because the short peptides and the amino acids have similar structure, polarity, size, and physical properties, selecting a suitable first column 10 is difficult. In this embodiment, four different columns have been tested for separating standard short peptides: Eclipse XDB-C8, Jupiter C4, Chromolith® RP-18e, and Astec ChiroBiotic™ T. The polypeptide to be determined in this embodiment is glutathione. In the experiments, only Astec ChiroBiotic™ T could separate the amino acids and short peptides produced by glutathione hydrolysis. In addition, it is found that a low concentration of copper ions should be added to the mobile phase to increase the selectivity of the column.

FIG. 2 shows the chromatogram of the first column 10 with different switching times, in which the five peaks respectively represent: peak 1, L-glutamic acid (Glu); peak 2, glycine (Gly); peak 3, dipeptide Glu-Gly; peak 4, dipeptide Cys-Gly; peak 5, glutathione. In addition, the second switching time of valve 30 is at: A, 0.0 min; B, 10.5 min; C, 10.6 min; D, 10.7 min; and E, 10.8 min.

FIG. 3 shows the chromatogram of the second column 12, in which the three peaks respectively represent: peak 1, glycine (Gly); peak 2, L-glutamic acid (L-Glu); peak 3, D-glutamic acid (D-Glu). The second column 12 is Chirex 3126(D)-penicillamine. The copper ions of the mobile phase form complexes of different stability with the individual enantiomers of the amino acids, and these complexes undergo ligand exchange with the single chiral enantiomer packed within the second column 12, so that the enantiomers of the amino acids are separated. The experimental results show that as the concentration of methanol in the mobile phase is increased, the analysis time decreases, but the separation efficiency decreases as well. After some experiments, the concentration of methanol in the mobile phase is set at 10% (v/v).

In this embodiment, the switching times of the valves are important. If the switching times are improper, part of the sample may be lost, resulting in lower sensitivity and analysis error. The columns should therefore be switched at the proper times. In this embodiment, after the hydrolyte is separated by the first column 10, several switching times are tested according to the peak positions and their retention times. The short peptides and the enantiomers are then detected individually by the fluorescence detector 22, and their peak areas are calculated. One-way analysis of variance (ANOVA) is used to compare the peak areas obtained at the different switching times, followed by the least significant difference test to determine the optimum switching time. In this embodiment, the protein or polypeptide to be tested is glutathione, and after a series of experiments, it is determined that the valve 30 is first switched at 7.0 min and switched a second time at 10.7 min.
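The comparison of peak areas across switching times can be sketched as a one-way ANOVA F statistic. This is only an illustration of the standard calculation: the peak areas below are hypothetical placeholders rather than measured data, the function name `one_way_anova_f` is an assumption, and the follow-up least significant difference test is not shown.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of replicate measurements per
    treatment (here: one list of peak areas per switching time).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical peak areas (arbitrary units) for four candidate switching times.
peak_areas = {
    "10.5 min": [812.0, 798.0, 805.0],
    "10.6 min": [861.0, 870.0, 855.0],
    "10.7 min": [903.0, 911.0, 897.0],
    "10.8 min": [842.0, 850.0, 836.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(peak_areas.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F value relative to the critical value for the given degrees of freedom would indicate that the switching time has a significant effect on the recovered peak area, which is the condition under which a pairwise follow-up test becomes meaningful.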

To investigate the capability of the second column 12 to separate enantiomers, the second column 12 is used to isocratically separate the 20 common amino acids and their dextrorotatory (D) and levorotatory (L) enantiomers, which are divided into three groups so that they can be resolved within each group. Table 2 lists the results. Most enantiomers have a resolution greater than or close to 1.0; the second column 12 therefore has an excellent capability to separate the enantiomers of the amino acids. However, because cysteine has a thiol group (—SH), which may form a precipitate with copper ions, the second column 12 cannot identify cysteine. After that, according to the retention times, the 20 common dextrorotatory (D) and levorotatory (L) enantiomers are divided into three groups. One or more enantiomers of each group, whose peaks are completely resolved by isocratic elution, are selected, mixed, and eluted by gradient elution, so as to reduce the analysis time. Based on the chromatogram of the gradient elution, other enantiomers are added and separated by gradient elution under the same conditions. FIG. 4 shows the final chromatogram, in which 24 enantiomers of amino acids are simultaneously separated by the second column. The 24 enantiomers of amino acids are: (1) L-Lys, (2) D-Lys, (3) D-Arg, (4) Gly, (5) L-Ala, (6) D-Ser, (7) D-Thr, (8) D-Gln, (9) L-Pro, (10) L-Val, (11) L-His, (12) D-Pro, (13) D-Val, (14) L-Met, (15) L-Asp, (16) L-Ile, (17) D-Asp, (18) L-Glu, (19) D-Glu, (20) D-Leu, (21) L-Phe, (22) D-Phe, (23) L-Trp, (24) D-Trp.

TABLE 2

Name           Abbreviation  Side chain   L-(retention time)^(a)  D-(retention time)  Resolution
Glycine        G, Gly        Hydrophilic  5.50                    —                   —
Alanine        A, Ala        Hydrophobic  5.77                    7.14                3.04
Valine         V, Val        Hydrophobic  12.37                   19.16               4.68
Leucine        L, Leu        Hydrophobic  44.08                   46.94               1.10
Isoleucine     I, Ile        Hydrophobic  26.86                   30.53               1.33
Phenylalanine  F, Phe        Hydrophobic  78.31                   109.43              4.45
Tryptophan     W, Trp        Hydrophobic  151.53                  226.34              9.45
Tyrosine       Y, Tyr        Hydrophilic  25.38                   31.22               1.98
Aspartic acid  D, Asp        Acid         24.52                   30.99               3.5
Histidine      H, His        Alkaline     15.73                   19.33               2.58
Asparagine     N, Asn        Hydrophilic  6.011                   6.003               —
Glutamic acid  E, Glu        Acid         41.46                   45.58               1.29
Lysine         K, Lys        Alkaline     3.73                    4.14                1.00
Glutamine      Q, Gln        Hydrophilic  6.05                    7.03                1.85
Methionine     M, Met        Hydrophobic  21.70                   27.49               2.14
Arginine       R, Arg        Alkaline     4.19                    4.94                1.39
Serine         S, Ser        Hydrophilic  5.84                    6.21                0.74
Threonine      T, Thr        Hydrophilic  6.31                    6.94                0.90
Cysteine       C, Cys        Hydrophilic  —                       —                   —
Proline        P, Pro        Hydrophobic  7.63                    16.69               8.05

^(a)Retention time is an average of four measurements.
^(b)Separation conditions: column temperature 40° C., sample injection volume 20 μL, ultraviolet detector wavelength 254 nm, mobile phase flow rate 1 mL min⁻¹, and mobile phase MeOH/2 mM CuSO₄ = 10/90 (v/v).
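The statement that most enantiomer pairs reach a resolution of about 1.0 or more can be checked directly against the resolution column of Table 2. The sketch below simply transcribes those values; the 1.0 threshold is taken from the text, and the variable names are illustrative.

```python
# Resolution values transcribed from Table 2. Glycine (achiral), asparagine,
# and cysteine have no reported resolution and are therefore omitted.
RESOLUTION = {
    "Ala": 3.04, "Val": 4.68, "Leu": 1.10, "Ile": 1.33, "Phe": 4.45,
    "Trp": 9.45, "Tyr": 1.98, "Asp": 3.5, "His": 2.58, "Glu": 1.29,
    "Lys": 1.00, "Gln": 1.85, "Met": 2.14, "Arg": 1.39, "Ser": 0.74,
    "Thr": 0.90, "Pro": 8.05,
}

THRESHOLD = 1.0  # resolution level referred to in the description

resolved = sorted(aa for aa, rs in RESOLUTION.items() if rs >= THRESHOLD)
below = sorted(aa for aa, rs in RESOLUTION.items() if rs < THRESHOLD)

print(f"{len(resolved)} of {len(RESOLUTION)} pairs have resolution >= {THRESHOLD}")
print("below threshold:", below)
```

Only serine and threonine fall below 1.0, and both remain close to it, which is consistent with the wording "greater than or close to 1.0".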

Next, the detection limit of the fluorescence detector 22 is investigated. First, high-concentration standard solutions of the amino acid enantiomers are prepared and then diluted to 0 μg mL⁻¹, 0.25 μg mL⁻¹, 0.5 μg mL⁻¹, 1.0 μg mL⁻¹, 2.5 μg mL⁻¹, and 5.0 μg mL⁻¹. Each standard solution is measured 5 times, and the lowest 4 concentrations are selected to prepare the calibration curve. The detection limit is determined from the calibration curve. A calibration curve is prepared for each of the 20 common dextrorotatory (D) and levorotatory (L) enantiomers of amino acids. The results show that the detection limit of the fluorescence detector 22 is between 0.1 and 0.2 μg mL⁻¹, which is superior to that of the ultraviolet detectors reported in the literature.

To investigate the sensitivity of the mass spectrometer 24, the present invention uses reduced glutathione (formed from glutamic acid, cysteine, and glycine) and its two hydrolysis dipeptides (Cys-Gly and γ-Glu-Cys) to prepare external standard calibration curves. The lowest 5 concentrations (0, 1.0, 2.5, 5.0, 7.5 μg mL⁻¹) are used to make the calibration curves, and each standard solution is measured 3 times. The detection limit and the quantitation limit are determined from the calibration curves. The results show that the detection limit and the quantitation limit are 0.9 and 3.1 μg mL⁻¹, respectively, for glutathione, 1.1 and 3.6 μg mL⁻¹ for Cys-Gly, and 0.9 and 3.1 μg mL⁻¹ for γ-Glu-Cys.
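One common way to derive both limits from a calibration curve is sketched below. The text does not state which formula was used, so the 3σ/slope and 10σ/slope convention here is an assumption, as are the function names; the signal values are hypothetical placeholders, and only the concentration levels are taken from the description.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x.

    Returns (slope, intercept, residual standard deviation).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s_res = sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, s_res


def detection_limits(x, y):
    """Return (LOD, LOQ), assuming LOD = 3*s/slope and LOQ = 10*s/slope."""
    slope, _, s_res = fit_line(x, y)
    return 3 * s_res / slope, 10 * s_res / slope


# Concentration levels from the description (ug/mL); signals are hypothetical.
concentrations = [0.0, 1.0, 2.5, 5.0, 7.5]
signals = [120.0, 1950.0, 4700.0, 9300.0, 13800.0]

lod, loq = detection_limits(concentrations, signals)
print(f"LOD = {lod:.2f} ug/mL, LOQ = {loq:.2f} ug/mL")
```

Under this convention the quantitation limit is always 10/3 of the detection limit, which is close to the ratios of the values reported above (for example 3.1 to 0.9).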

This invention uses a self-designed 20-well array reactor for the hydrolysis reaction. The hydrolysis reaction may take 1-4 days at a predetermined temperature. Table 3 lists the analysis results for the hydrolyte of glutathione after 1 to 4 days of hydrolysis at 90° C. In the preferred embodiment, glutathione is hydrolyzed for 1 day and the hydrolyte is used to determine the amino acid sequence.

TABLE 3

Each cell gives the concentration in ppm, with the RSD (%) in parentheses.

                       3D-HPLC-FD system                                        3D-HPLC-ESI-MS system
Temp (°C)  Time (day)  Gly                L-Glu              D-Glu              Cys-Gly            Glu-Cys            Glutathione
90         1           2.2 ± 0.1 (4.7)    3.5 ± 0.1 (2.8)    —                  3.1 ± 0.6 (19.1)   1.2 ± 0.2 (21.8)   11.5 ± 2.1 (17.9)
90         2           6.2 ± 0.4 (6.3)    6.8 ± 0.4 (6.4)    —                  4.1 ± 0.7 (16.8)   1.4 ± 0.2 (18.7)   5.4 ± 1.2 (22.2)
90         3           8.2 ± 0.1 (1.6)    11.3 ± 0.4 (3.7)   —                  6.5 ± 0.8 (12.3)   1.4 ± 0.4 (32.1)   4.1 ± 0.7 (17.3)
90         4           13.3 ± 0.2 (1.6)   10.6 ± 0.3 (2.8)   0.6 ± 0.1 (14.2)   6.4 ± 0.7 (11.1)   1.8 ± 0.3 (12.4)   1.7 ± 0.4 (23.7)

In another embodiment of this invention, aspartame is used as the polypeptide whose amino acid sequence is to be determined. Aspartame is a dipeptide constituted by aspartic acid (Asp) and phenylalanine (Phe). Table 4 lists the quantitative analysis of its hydrolyte at 90° C. for reaction periods of 1-4 days. In the preferred embodiment, aspartame is hydrolyzed for 1 day and the hydrolyte is used to determine the amino acid sequence.

TABLE 4

Each cell gives the concentration in ppm, with the RSD (%) in parentheses.

                       3D-HPLC-FD system                                                           3D-HPLC-ESI-MS system
Temp (°C)  Time (day)  L-Asp              D-Asp              L-Phe              D-Phe              Aspartame
90         1           3.7 ± 0.1 (2.6)    —                  3.1 ± 0.2 (6.5)    —                  11.4 ± 1.4 (21.8)
90         2           10.8 ± 0.3 (2.4)   2.9 ± 0.1 (3.5)    5.7 ± 0.2 (3.5)    —                  5.2 ± 0.8 (18.7)
90         3           10.6 ± 0.3 (2.7)   3.1 ± 0.1 (3.4)    6.8 ± 0.3 (4.2)    —                  2.1 ± 0.4 (32.1)
90         4           8.7 ± 0.2 (2.2)    2.8 ± 0.1 (3.5)    6.4 ± 0.3 (4.8)    —                  —

After the amino acid enantiomers of the hydrolyte are identified by the second column 12, the ESI mass spectrometer 24 is used to measure the molecular weight of the short peptides of the hydrolyte from the obtained mass spectrum signal (m/z). The amino acid enantiomers identified by the second column 12 are combined to construct all possible short peptides, in order from the lowest-molecular-weight dipeptide to higher-molecular-weight short peptides, and the correct short peptides are confirmed by matching the molecular weight signal (m/z) obtained from the mass spectrometry. The confirmed short peptides are then combined to construct all possible longer peptides, which are in turn confirmed by the molecular weight signal (m/z) of mass spectrometry. The procedure is repeated until the correct amino acid sequence is found. The procedure can also be assisted by a computer program. The following two examples respectively illustrate the procedure used to determine the amino acid sequence of glutathione and aspartame.

Reduced glutathione is a tripeptide constituted by L-glutamic acid, L-cysteine, and glycine. First, the qualitative analysis of the hydrolyte using the second column 12 identifies glycine and L-glutamic acid. Because the second column cannot identify L-cysteine, the molecular weight signal (m/z) of mass spectrometry is used to investigate whether L-cysteine is present. Since the mass spectrum shows a signal with a mass-to-charge ratio (m/z) of 122.1, corresponding to cysteine, it is confirmed that glutathione has three amino acids, i.e., glycine, L-glutamic acid, and L-cysteine.

After that, the identified amino acids are combined to construct all possible dipeptides. If X, Y, and Z denote L-glutamic acid (Glu), L-cysteine (Cys), and glycine (Gly), respectively, then the possible dipeptides are XX, YY, ZZ, XY, YX, YZ, ZY, XZ, and ZX. Since the mass spectrum did not show dipeptides constituted by two identical amino acids, Table 5 lists only the molecular weight signals (m/z) of the 6 dipeptide fragments constituted by different amino acids. By comparing the molecular weight signals (m/z) of mass spectrometry, the dipeptides XY (Glu-Cys, m/z=251.3) and YZ (Cys-Gly, m/z=179.2) are confirmed.

TABLE 5

MS(+) fragments. Each cell gives the m/z value, with the signal in counts (cnts) in parentheses; M denotes the dipeptide in that row.

Dipeptide               [M + H]⁺        [M + Na]⁺       Fragment ion
[Glu-Cys] (250.3 Da)    251.3 (904)     273.3 (467)     [GluCys − Cys]⁺ 130.1 (584)
[Cys-Glu] (250.3 Da)    251.3 (904)     273.3 (467)     [CysGlu − Glu]⁺ 104.1 (455)
[Glu-Gly] (204.2 Da)    205.2 (—)       227.2 (586)     [GluGly − Gly]⁺ 130.1 (584)
[Gly-Glu] (204.2 Da)    205.2 (—)       227.2 (586)     [GlyGlu − Glu]⁺ 58.1 (—)
[Gly-Cys] (178.2 Da)    179.2 (1669)    201.2 (1330)    [GlyCys − Cys]⁺ 58.1 (—)
[Cys-Gly] (178.2 Da)    179.2 (1669)    201.2 (1330)    [CysGly − Gly]⁺ 104.1 (455)
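The dipeptide step can be sketched as enumerating every ordered pair of the identified amino acids and comparing the calculated [M + H]⁺ value with the observed signals. This is an illustrative sketch, not the disclosed computer program: the molecular weights come from Table 1, while the water mass, proton mass, matching tolerance, and function names are assumptions.

```python
from itertools import product

# Free amino acid molecular weights from Table 1 (Da)
MW = {"Glu": 147.13, "Cys": 121.16, "Gly": 75.07}
WATER = 18.02    # assumed mass lost per peptide bond formed
PROTON = 1.008   # assumed mass added for the [M + H]+ ion

# [M + H]+ signals reported in the text for the glutathione hydrolyte
OBSERVED_MZ = [179.2, 251.3, 308.3]


def protonated_mz(peptide):
    """Calculated m/z of the singly protonated peptide."""
    mass = sum(MW[r] for r in peptide) - WATER * (len(peptide) - 1)
    return mass + PROTON


def matches_observed(peptide, tol=0.1):
    """True if the calculated [M + H]+ lies within `tol` of an observed signal."""
    return any(abs(protonated_mz(peptide) - mz) <= tol for mz in OBSERVED_MZ)


candidates = list(product(MW, repeat=2))   # XX, XY, XZ, YX, ... (9 dipeptides)
matched = [p for p in candidates if matches_observed(p)]

for p in matched:
    print("-".join(p), round(protonated_mz(p), 1))
```

Molecular weight alone cannot tell Glu-Cys from Cys-Glu, or Cys-Gly from Gly-Cys, because sequence isomers share the same mass; the fragment-ion column of Table 5 is what narrows each pair down to a single order.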

The confirmed dipeptides XY and YZ are combined to construct all possible tripeptides. There is only one possible tripeptide, i.e., XYZ (Glu-Cys-Gly), and it is confirmed by the molecular weight signal (m/z=308.3) of mass spectrometry. Then, the confirmed dipeptides XY and YZ and the tripeptide XYZ are combined to construct all possible tetrapeptides; however, the molecular weight signal (m/z) of mass spectrometry shows no possible tetrapeptide. Then, the confirmed dipeptides XY and YZ and the tripeptide XYZ are combined to construct all possible pentapeptides. The possible pentapeptides include XYZXY, XYXYZ, XYZYZ, and YZXYZ. However, none of the possible pentapeptides matches the molecular weight signal (m/z) of mass spectrometry. Finally, the confirmed dipeptides XY and YZ and the tripeptide XYZ are combined to construct all possible hexapeptides. The only possible hexapeptide is XYZXYZ, which does not match the molecular weight signal (m/z) of mass spectrometry. Therefore, it is confirmed that the polypeptide is a tripeptide. Table 6 lists all tripeptides formed by Glu, Cys, and Gly and their mass fragment signals. By comparing the mass fragment signals, it is judged that two of these tripeptides, Glu-Cys-Gly and Glu-Gly-Cys, are matched:

TABLE 6

Tripeptide      MS(+) fragment (cnts)

[Glu-Cys-Gly]   [M + H]⁺          [M + Na]⁺         [M − Gly]⁺        [M − CysGly]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 233.3)       (m/z 130.1)
                16559             2807              9387              36411

[Cys-Glu-Gly]   [M + H]⁺          [M + Na]⁺         [M − Gly]⁺        [M − GluGly]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 233.3)       (m/z 104.1)
                16559             2807              9387              —

[Glu-Gly-Cys]   [M + H]⁺          [M + Na]⁺         [M − Cys]⁺        [M − GlyCys]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 187.2)       (m/z 130.1)
                16559             2807              7480              36411

[Gly-Glu-Cys]   [M + H]⁺          [M + Na]⁺         [M − Cys]⁺        [M − GluCys]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 187.2)       (m/z 58.1)
                16559             2807              7480              —

[Gly-Cys-Glu]   [M + H]⁺          [M + Na]⁺         [M − Glu]⁺        [M − CysGlu]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 161.2)       (m/z 58.1)
                16559             2807              4871              —

[Cys-Gly-Glu]   [M + H]⁺          [M + Na]⁺         [M − Glu]⁺        [M − GlyGlu]⁺
(307.3 Da)      (m/z 308.3)       (m/z 330.3)       (m/z 161.2)       (m/z 104.1)
                16559             2807              4971              —

However, by checking the mass fragment signals of the dipeptides listed in Table 5, it is found that only the first tripeptide, i.e., Glu-Cys-Gly, is matched. Thus, the amino acid sequence of the polypeptide is confirmed as Glu-Cys-Gly.
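The two-stage selection in this example, first by the tripeptide fragment signals and then by the confirmed dipeptides, can be sketched in Python as follows. This is a minimal illustration and not the procedure as implemented in the system: the average amino acid masses are standard reference values, the helper names and the 0.15 m/z tolerance are hypothetical, and each fragment is modelled as the peptide minus the free C-terminal piece plus one proton, an assumption that reproduces most of the m/z values in Table 6.

```python
from itertools import permutations

# Average masses in Da (standard reference values, assumed; not from the description)
AMINO_ACID_MASS = {"Glu": 147.13, "Cys": 121.16, "Gly": 75.07}
WATER, PROTON = 18.015, 1.008


def peptide_mass(residues):
    """Mass of a linear peptide: sum of amino acids minus one water per bond."""
    return sum(AMINO_ACID_MASS[r] for r in residues) - WATER * (len(residues) - 1)


def c_terminal_loss_mz(peptide, n_lost):
    """Assumed model of [M - tail]+: M minus the free C-terminal tail, plus H."""
    return peptide_mass(peptide) - peptide_mass(peptide[-n_lost:]) + PROTON


def is_observed(mz, observed, tol=0.15):
    return any(abs(mz - obs) <= tol for obs in observed)


# Fragment signals with non-zero counts in Table 6
observed_fragments = [233.3, 187.2, 161.2, 130.1]
# Dipeptides confirmed in the text (XY and YZ)
confirmed_dipeptides = {("Glu", "Cys"), ("Cys", "Gly")}

candidates = list(permutations(["Glu", "Cys", "Gly"]))

# Stage 1: both C-terminal loss fragments must appear in the spectrum
fragment_matched = [
    p for p in candidates
    if all(is_observed(c_terminal_loss_mz(p, n), observed_fragments) for n in (1, 2))
]

# Stage 2: every adjacent residue pair must be a confirmed dipeptide
sequence = [
    p for p in fragment_matched
    if all(pair in confirmed_dipeptides for pair in zip(p, p[1:]))
]

print(["-".join(p) for p in fragment_matched])
print(["-".join(p) for p in sequence])
```

With these inputs, stage 1 keeps Glu-Cys-Gly and Glu-Gly-Cys, and stage 2 leaves only Glu-Cys-Gly, because Glu-Gly is not among the confirmed dipeptides.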

In another example, aspartame is used as the polypeptide whose amino acid sequence is to be determined. Aspartame is a methyl ester dipeptide formed by aspartic acid (Asp) and phenylalanine (Phe) methyl ester. In this example, hydrolysis of aspartame yields a hydrolyte containing un-hydrolyzed aspartame, L-aspartic acid, L-phenylalanine, and methanol.

First, the second column 12 identifies two kinds of amino acid enantiomers in the polypeptide, L-aspartic acid and L-phenylalanine. In addition, the molecular weight signal (m/z) of mass spectrometry of the hydrolyte obtained from the mass spectrometer 24 shows no signal at a mass-to-charge ratio (m/z) of 122.1 corresponding to cysteine. Therefore, it is confirmed that aspartame has only two constituent amino acids, L-aspartic acid (Asp) and L-phenylalanine (Phe).

Then, L-aspartic acid (Asp) and L-phenylalanine (Phe) are combined to construct all possible dipeptides. If X and Y denote L-aspartic acid and L-phenylalanine, respectively, then the possible dipeptides include XX, YY, XY, and YX. By comparing with the molecular weight signal (m/z) of mass spectrometry, the dipeptide confirmed to be present is XY (Asp-Phe, 280.3 Da, [M + H]⁺ at m/z 281.3). However, its mass fragment signal is weak, and it is deduced that some other group may modify this dipeptide. By trial and error, some common groups are used to modify XY, and each modified dipeptide is checked against the molecular weight signal (m/z) of mass spectrometry. This is laborious work. Finally, a modified XY, Asp-Phe-OCH₃, is confirmed by the molecular weight signal (m/z) of mass spectrometry, and it is determined that the amino acid sequence of the polypeptide is Asp-Phe-OCH₃. Table 7 lists the mass fragment signals of the dipeptides in this example.

TABLE 7

Dipeptide       MS(+) fragment (cnts)

[Asp-Phe]       [AspPhe + H]⁺     [AspPhe + Na]⁺    [AspPhe − Phe]⁺   [Phe + H]⁺
(280.3 Da)      (m/z 281.3)       (m/z 303.3)       (m/z 116.2)       (m/z 166.1)
                4079              1693              1140              7593

[Phe-Asp]       [PheAsp + H]⁺     [PheAsp + Na]⁺    [PheAsp − Asp]⁺   [Asp + H]⁺
(280.3 Da)      (m/z 281.3)       (m/z 303.3)       (m/z 148.1)       (m/z 134.2)
                4079              1693              1370              1930

[Asp-Phe]ME     [AspPhe + H]⁺     [AspPhe + Na]⁺    [AspPhe − Phe]⁺   [Phe + H]⁺
(294.3 Da)      (m/z 295.3)       (m/z 317.2)       (m/z 116.2)       (m/z 180.3)
                10081             1033              1140              8761

[Phe-Asp]ME     [PheAsp + H]⁺     [PheAsp + Na]⁺    [PheAsp − Asp]⁺   [Asp + H]⁺
(294.3 Da)      (m/z 295.3)       (m/z 317.2)       (m/z 148.1)       (m/z 134.2)
                10081             1033              1370              1930
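The trial-and-error search for a modifying group in this example can be illustrated by the following Python sketch. The description does not say which common groups were tried, so the list of groups and their mass shifts below is purely hypothetical, as are the function names and the 0.15 m/z tolerance; the average amino acid masses are standard reference values.

```python
# Average masses in Da (standard reference values, assumed; not from the description)
ASP, PHE, WATER, PROTON = 133.10, 165.19, 18.015, 1.008

# Hypothetical list of "common groups" and the mass each adds to the free dipeptide
MODIFICATION_SHIFT = {
    "none": 0.0,
    "methyl ester": 14.027,  # -OH replaced by -OCH3 (net gain of CH2)
    "ethyl ester": 28.054,   # -OH replaced by -OC2H5 (net gain of C2H4)
    "acetyl": 42.037,        # N-H replaced by N-COCH3 (net gain of C2H2O)
}


def protonated_mz(modification):
    """[M + H]+ of Asp-Phe carrying the named modification."""
    dipeptide = ASP + PHE - WATER
    return dipeptide + MODIFICATION_SHIFT[modification] + PROTON


def matching_modifications(observed_mz, tol=0.15):
    """Names of all modifications whose [M + H]+ lies within tol of the signal."""
    return [
        name for name in MODIFICATION_SHIFT
        if abs(protonated_mz(name) - observed_mz) <= tol
    ]


strong_signal = matching_modifications(295.3)  # 10081 cnts in Table 7
weak_signal = matching_modifications(281.3)    # 4079 cnts in Table 7
print(strong_signal, weak_signal)
```

Among the groups assumed here, only the methyl ester accounts for the stronger signal at m/z 295.3, while the unmodified dipeptide accounts for the weaker signal at m/z 281.3.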

Accordingly, this invention provides a three-dimensional HPLC system with an ion trap mass spectrometer for determining the amino acid sequence of a protein or a polypeptide. The principle described in the above examples can be applied to any other proteins or polypeptides.

The detection limit of the fluorescence detector 22 used in the system is about 0.1-0.2 μg mL⁻¹ with a relative standard deviation (RSD) of about 1.6-6.5%, and the detection limit of the mass spectrometer 24 is about 0.9-1.1 μg mL⁻¹ with an RSD of about 17.3-23.7%, revealing excellent sensitivity and accuracy.

The determination procedure of the present invention is a “small-to-large” procedure. The constituent amino acids are first confirmed; all possible dipeptides are then constructed from the constituent amino acids and confirmed by the molecular weight signal (m/z) of mass spectrometry. Next, from the confirmed dipeptides, possible larger peptides (tripeptides, tetrapeptides, pentapeptides, and so on) are constructed in order from small molecular weight to large molecular weight and confirmed by matching the molecular weight signal (m/z) of mass spectrometry. In addition, because the enantiomers of amino acids and amino acid isomers can be separated by the second column 12, the determined sequence can be 100% accurate. Note that the conventional art uses a “large-to-small” determination procedure, which is different from that of the present invention. In addition, a database is unnecessary for the determination procedure of the present invention, and the procedure can be assisted by a computer. Accordingly, the present invention provides systems and methods for determining the amino acid sequence of a protein or polypeptide in an effective and fast manner.

Although specific embodiments have been illustrated and described, it will be appreciated by those skilled in the art that various modifications may be made without departing from the scope of the present invention, which is intended to be limited solely by the appended claims.
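As a rough illustration of how a computer could assist the growth step of this “small-to-large” procedure, the following Python sketch builds larger candidates from confirmed fragments and checks them against observed [M + H]⁺ signals, using the glutathione example (X = Glu, Y = Cys, Z = Gly). Joining two confirmed fragments end to end is only one reading of how the pentapeptide and hexapeptide candidates in that example were formed; the function names, the 0.15 m/z tolerance, and the average amino acid masses (standard reference values) are assumptions.

```python
from itertools import product

# Average masses in Da (standard reference values, assumed); X = Glu, Y = Cys, Z = Gly
AMINO_ACID_MASS = {"X": 147.13, "Y": 121.16, "Z": 75.07}
WATER, PROTON = 18.015, 1.008


def protonated_mz(peptide):
    """[M + H]+ of a linear peptide written as a string of one-letter symbols."""
    mass = sum(AMINO_ACID_MASS[r] for r in peptide) - WATER * (len(peptide) - 1)
    return mass + PROTON


def grow_candidates(confirmed, length):
    """Join two confirmed fragments end to end to reach the target length."""
    return sorted(
        {a + b for a, b in product(confirmed, repeat=2) if len(a + b) == length}
    )


def confirm(candidates, observed_mz, tol=0.15):
    """Keep candidates whose [M + H]+ matches an observed signal within tol."""
    return [
        p for p in candidates
        if any(abs(protonated_mz(p) - obs) <= tol for obs in observed_mz)
    ]


confirmed = ["XY", "YZ", "XYZ"]
observed = [179.2, 251.3, 308.3]  # [M + H]+ signals reported for glutathione

pentapeptides = grow_candidates(confirmed, 5)
hexapeptides = grow_candidates(confirmed, 6)
larger_hits = confirm(pentapeptides + hexapeptides, observed)
print(pentapeptides, hexapeptides, larger_hits)
```

Under these assumptions the sketch generates the same four pentapeptide candidates and single hexapeptide candidate as the glutathione example, and none of them matches an observed signal, so the growth stops at the tripeptide.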

What is claimed is:
1. A system for determining the amino acid sequence of a protein or a polypeptide, in which the protein or the polypeptide is hydrolyzed to a hydrolyte comprising amino acid enantiomers, short peptides formed by the amino acid enantiomers, and un-hydrolyzed protein or un-hydrolyzed polypeptide, and the system comprises: a first column connecting to an ultraviolet detector so as to separate the amino acid enantiomers and short peptides; a second column connecting to a fluorescence detector so as to identify the amino acid enantiomers; and a third column connecting to a mass spectrometry so as to identify the short peptides and obtain a molecular weight signal (m/z) of mass spectrometry; wherein the identified amino acid enantiomers are combined to construct one or more possible short peptides in an order from the small molecular weight to large molecular weight, and the one or more possible short peptides is confirmed by matching the molecular weight obtained from the molecular weight signal (m/z) of mass spectrometry, so as to determine the amino acid sequence of the protein or the polypeptide.
2. The system as recited in claim 1, wherein the second column is capable of identifying amino acid enantiomers and isomers.
3. The system as recited in claim 1, wherein the mass spectrometer is an ion-trap mass spectrometer with an Electrospray Ionization Interface (ESI).
4. The system as recited in claim 1, wherein the first column is an affinity chiral column using copper sulphate/methanol solution as its mobile phase.
5. The system as recited in claim 1, wherein the second column is a ligand-exchange column using copper sulphate/methanol solution as its mobile phase.
6. The system as recited in claim 5, wherein the volume ratio of copper sulphate-to-methanol is 90/10.
7. The system as recited in claim 1, wherein the third column is a reversed-phase column using 100% methanol as its mobile phase.
8. A method for determining the amino acid sequence of a protein or a polypeptide, comprising the steps of: hydrolyzing the protein or the polypeptide to a hydrolyte comprising amino acid enantiomers, short peptides formed by the amino acid enantiomers, and un-hydrolyzed protein or un-hydrolyzed polypeptide; separating the amino acid enantiomers and the short peptides; identifying the amino acid enantiomers; using a mass spectrometry to identify the short peptides and obtain a molecular weight signal (m/z) of mass spectrometry; and constructing one or more possible short peptides in an order from the small molecular weight to large molecular weight, and the one or more possible short peptides are confirmed by matching the molecular weight obtained from the molecular weight signal (m/z) of mass spectrometry, so as to determine the amino acid sequence of the protein or the polypeptide.
9. The method as recited in claim 8, wherein the mass spectrometer is an ion-trap mass spectrometer with an Electrospray Ionization Interface (ESI).
10. The method as recited in claim 8, wherein the amino acid enantiomers and isomers are also identified.